<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>PyMOTW: csv &mdash; PyMOTW Document v1.6 documentation</title>
    <link rel="stylesheet" href="../_static/mytest.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '1.6',
        COLLAPSE_MODINDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    
    <!-- comment utility -->
    <link rel="stylesheet" href="../_static/comment.css" type="text/css" />
    <script type="text/javascript" src="../_static/md5.js"></script>
    <script type="text/javascript" src="../_static/comment.js"></script>
    <!-- end -->
    <link rel="author" title="About these documents" href="../about.html" />
    <link rel="copyright" title="Copyright" href="../copyright.html" />
    <link rel="top" title="PyMOTW Document v1.6 documentation" href="../index.html" />
    <link rel="next" title="PyMOTW: calendar" href="calendar.html" />
    <link rel="prev" title="PyMOTW: sched" href="sched.html" /> 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="calendar.html" title="PyMOTW: calendar"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="sched.html" title="PyMOTW: sched"
             accesskey="P">previous</a> |</li>
        <li><a href="../index.html">PyMOTW Document v1.6 documentation</a> &raquo;</li> 
      </ul>
    </div> 
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
            <h3><a href="../index.html">Table Of Contents</a></h3>
            <ul>
<li><a class="reference external" href="">PyMOTW: csv</a><ul>
<li><a class="reference external" href="#id1">描述</a></li>
<li><a class="reference external" href="#id2">局限性</a></li>
<li><a class="reference external" href="#id3">读取</a></li>
<li><a class="reference external" href="#id4">写入</a></li>
<li><a class="reference external" href="#id5">引用</a></li>
<li><a class="reference external" href="#dialects">Dialects</a></li>
<li><a class="reference external" href="#dictreader-dictwriter">DictReader 和DictWriter</a></li>
<li><a class="reference external" href="#id7">参考</a></li>
</ul>
</li>
</ul>

            <h4>Previous topic</h4>
            <p class="topless"><a href="sched.html"
                                  title="previous chapter">PyMOTW: sched</a></p>
            <h4>Next topic</h4>
            <p class="topless"><a href="calendar.html"
                                  title="next chapter">PyMOTW: calendar</a></p>
            <h3>This Page</h3>
            <ul class="this-page-menu">
              <li><a href="../_sources/documents/csv.txt"
                     rel="nofollow">Show Source</a></li>
            </ul>
          <div id="searchbox" style="display: none">
            <h3>Quick search</h3>
              <form class="search" action="../search.html" method="get">
                <input type="text" name="q" size="18" />
                <input type="submit" value="Go" />
                <input type="hidden" name="check_keywords" value="yes" />
                <input type="hidden" name="area" value="default" />
              </form>
              <p class="searchtip" style="font-size: 90%">
              Enter search terms or a module, class or function name.
              </p>
          </div>
          <script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div> 

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="pymotw-csv">
<h1>PyMOTW: csv<a class="headerlink" href="#pymotw-csv" title="Permalink to this headline">¶</a></h1>
<ul class="simple">
<li>模块: csv</li>
<li>目的: 对以分号分隔的数值文件进行读写</li>
<li>Python 版本: 2.3+</li>
</ul>
<div class="section" id="id1">
<h2>描述<a class="headerlink" href="#id1" title="Permalink to this headline">¶</a></h2>
<p>csv 模块在处理那些从电子数据表格或数据库中导入到文本文件的数据时, 是很有用的. 这里并没有很好的定义标准, 因此csv模块使用了&#8221;dialects&#8221;, 通过使用不同的参数来解析csv文件. 对于一般的读和写, 这个模块也能处理Microsoft Excel格式数据.</p>
</div>
<div class="section" id="id2">
<h2>局限性<a class="headerlink" href="#id2" title="Permalink to this headline">¶</a></h2>
<p>Python 2.5 版本的csv不支持unicode数据, 而对于ASCII的NUL字符处理也有点问题, 所以推荐使用UTF-8或可打印ASCII字符.</p>
</div>
<div class="section" id="id3">
<h2>读取<a class="headerlink" href="#id3" title="Permalink to this headline">¶</a></h2>
<p>从csv文件中读取数据, 可以使用reader()函数来创建一个读取对象. 这个读取对象顺序处理文件的每一行, 可以把它当成迭代器使用, 例如:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">csv</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mf">1</span><span class="p">],</span> <span class="s">&#39;rt&#39;</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>

    <span class="n">reader</span> <span class="o">=</span> <span class="n">csv</span><span class="o">.</span><span class="n">reader</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">reader</span><span class="p">:</span>
        <span class="k">print</span> <span class="n">row</span>

<span class="k">finally</span><span class="p">:</span>

    <span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p>reader()的第一个参数指示源文本行, 在这个例子中, 是一个文件, 但它可以是任何可转换的对象(StringIO对象, lists等). 指定其他可选的参数可用于控制输入的数据如何被解析.</p>
<p>例子文件&#8221;testdata.csv&#8221;是从NeoOffice中导入的, 其内容如下.</p>
<div class="highlight-python"><pre>$ cat testdata.csv
"Title 1","Title 2","Title 3"
1,"a",08/18/07
2,"b",08/19/07
3,"c",08/20/07
4,"d",08/21/07
5,"e",08/22/07
6,"f",08/23/07
7,"g",08/24/07
8,"h",08/25/07
9,"i",08/26/07</pre>
</div>
<p>它被读取时, 输入数据的每一行被转换为一个字符串列表.</p>
<div class="highlight-python"><pre>$ python csv_reader.py testdata.csv
['Title 1', 'Title 2', 'Title 3']
['1', 'a', '08/18/07']
['2', 'b', '08/19/07']
['3', 'c', '08/20/07']
['4', 'd', '08/21/07']
['5', 'e', '08/22/07']
['6', 'f', '08/23/07']
['7', 'g', '08/24/07']
['8', 'h', '08/25/07']
['9', 'i', '08/26/07']</pre>
</div>
<p>如果你知道特定的列具有特定的类型, 你就可以自行转换, 但csv不会自动转换. 它会自动处理嵌入在一行字符串中(这个行和输入源文件的&#8221;行&#8221;意思是不同的)的换行符.</p>
<div class="highlight-python"><pre>$ cat testlinebreak.csv
"Title 1","Title 2","Title 3"
1,"first line ## 这是源文件的一个line
second line",08/18/07

$ python csv_reader.py testlinebreak.csv
['Title 1', 'Title 2', 'Title 3']
['1', 'first line\nsecond line', '08/18/07'] ## 这是csv的一个row</pre>
</div>
</div>
<div class="section" id="id4">
<h2>写入<a class="headerlink" href="#id4" title="Permalink to this headline">¶</a></h2>
<p>当你想把数据导入到其他应用程序中, 对CSV文件的写入也是非常方便的. 使用writer()函数来创建一个写入对象, 对于每一行, 使用writerow()来输出一行.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">csv</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mf">1</span><span class="p">],</span> <span class="s">&#39;wt&#39;</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>

    <span class="n">writer</span> <span class="o">=</span> <span class="n">csv</span><span class="o">.</span><span class="n">writer</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
    <span class="n">writer</span><span class="o">.</span><span class="n">writerow</span><span class="p">(</span> <span class="p">(</span><span class="s">&#39;Title 1&#39;</span><span class="p">,</span> <span class="s">&#39;Title 2&#39;</span><span class="p">,</span> <span class="s">&#39;Title 3&#39;</span><span class="p">)</span> <span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mf">10</span><span class="p">):</span>
        <span class="n">writer</span><span class="o">.</span><span class="n">writerow</span><span class="p">(</span> <span class="p">(</span><span class="n">i</span><span class="o">+</span><span class="mf">1</span><span class="p">,</span> <span class="nb">chr</span><span class="p">(</span><span class="nb">ord</span><span class="p">(</span><span class="s">&#39;a&#39;</span><span class="p">)</span> <span class="o">+</span> <span class="n">i</span><span class="p">),</span> <span class="s">&#39;08/</span><span class="si">%02d</span><span class="s">/07&#39;</span> <span class="o">%</span> <span class="p">(</span><span class="n">i</span><span class="o">+</span><span class="mf">1</span><span class="p">))</span> <span class="p">)</span>

<span class="k">finally</span><span class="p">:</span>

    <span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p>这个例子的输出和上述读取例子的导出数据看起来不怎么一样.</p>
<div class="highlight-python"><pre>$ python csv_writer.py testout.csv
$ cat testout.csv
Title 1,Title 2,Title 3
1,a,08/01/07
2,b,08/02/07
3,c,08/03/07
4,d,08/04/07
5,e,08/05/07
6,f,08/06/07
7,g,08/07/07
8,h,08/08/07
9,i,08/09/07
10,j,08/10/07</pre>
</div>
<p>写入对象没有使用默认的引号, 所以每列字符串没有用引号引起来. 但如果增加额外的引用参数即可将非数值数据用引号引起来.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">writer</span> <span class="o">=</span> <span class="n">csv</span><span class="o">.</span><span class="n">writer</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">quoting</span><span class="o">=</span><span class="n">csv</span><span class="o">.</span><span class="n">QUOTE_NONNUMERIC</span><span class="p">)</span>
</pre></div>
</div>
<p>现在每个字符串都被引起来了:</p>
<div class="highlight-python"><pre>$ python csv_writer_quoted.py testout_quoted.csv
$ cat testout_quoted.csv
"Title 1","Title 2","Title 3"
1,"a","08/01/07"
2,"b","08/02/07"
3,"c","08/03/07"
4,"d","08/04/07"
5,"e","08/05/07"
6,"f","08/06/07"
7,"g","08/07/07"
8,"h","08/08/07"
9,"i","08/09/07"
10,"j","08/10/07"</pre>
</div>
</div>
<div class="section" id="id5">
<h2>引用<a class="headerlink" href="#id5" title="Permalink to this headline">¶</a></h2>
<p>还有4种不同的引用选项, 它们作为常量定义在csv模块中.</p>
<dl class="docutils">
<dt>QUOTE_ALL</dt>
<dd>不管是什么类型, 任何内容都加上引号</dd>
<dt>QUOTE_MINIMAL</dt>
<dd>这是默认的, 使用指定的字符引用各个域(如果解析器被配置为相同的dialect和选项时, 可能会让解析器在解析时产生混淆)</dd>
<dt>QUOTE_NONNUMERIC</dt>
<dd>引用那些不是整数或浮点数的域. 当使用读取对象时, 如果输入的域是没有引号, 那么它们会被转换成浮点数.</dd>
<dt>QUOTE_NONE</dt>
<dd>对所有的输出内容都不加引用, 当使用读取对象时, 引用字符看作是包含在每个域的值里(但在正常情况下, 他们被当成定界符而被去掉)</dd>
</dl>
</div>
<div class="section" id="dialects">
<h2>Dialects<a class="headerlink" href="#dialects" title="Permalink to this headline">¶</a></h2>
<p>有很多参数可以控制csv模块如何解析或读取数据. 但这不是通过各自传递给读取对象和写入对象相关参数, 而是统一起来, 使用一个&#8221;dialect&#8221;对象. Dialect类可以通过名字注册, 因此csv模块调用它时可以不必预先知道相关的参数设置. 标准库包含两种dialects: excel和excel-tabs. &#8220;excel&#8221; dialect是用于处理默认来自 Microsoft Excel格式的数据的, 同样, 也可以处理 OpenOffice 或 NeoOffice的数据. 更多详细的dialect参数及其使用在csv模块的 <a class="reference external" href="http://docs.python.org/lib/csv-fmt-params.html">节9.1.2</a> 中有说明.      ## dialect就是一些参数(定界符, 换行符等等)设置, 预先设置好的, 但同样我们也可以自己设定,</p>
</div>
<div class="section" id="dictreader-dictwriter">
<h2>DictReader 和DictWriter<a class="headerlink" href="#dictreader-dictwriter" title="Permalink to this headline">¶</a></h2>
<p>另外, 在处理数据序列时, csv模块包含了一些将行作为字典进行处理的类. 类DictReader和类DictWriter将每一行转成字典对象, 可以传递字典键值, 或者从输入文件的第一行中推断出键值.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">csv</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mf">1</span><span class="p">],</span> <span class="s">&#39;rt&#39;</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>

    <span class="n">reader</span> <span class="o">=</span> <span class="n">csv</span><span class="o">.</span><span class="n">DictReader</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">reader</span><span class="p">:</span>
        <span class="k">print</span> <span class="n">row</span>

<span class="k">finally</span><span class="p">:</span>
     <span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p>基于字典的读取和写入对象可以当作是基于序列对象的进一步实现, 它们使用相同的参数和API. 唯一的差别就是前者把每一行当成是字典而不是列表或元组.</p>
<div class="highlight-python"><pre>$ python csv_dictreader.py testdata.csv
{'Title 1': '1', 'Title 3': '08/18/07', 'Title 2': 'a'}
{'Title 1': '2', 'Title 3': '08/19/07', 'Title 2': 'b'}
{'Title 1': '3', 'Title 3': '08/20/07', 'Title 2': 'c'}
{'Title 1': '4', 'Title 3': '08/21/07', 'Title 2': 'd'}
{'Title 1': '5', 'Title 3': '08/22/07', 'Title 2': 'e'}
{'Title 1': '6', 'Title 3': '08/23/07', 'Title 2': 'f'}
{'Title 1': '7', 'Title 3': '08/24/07', 'Title 2': 'g'}
{'Title 1': '8', 'Title 3': '08/25/07', 'Title 2': 'h'}
{'Title 1': '9', 'Title 3': '08/26/07', 'Title 2': 'i'}</pre>
</div>
<p>DictWriter必须指定一个域名字的列表, 因为这样它才在输出时知道每个列的顺序.</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">csv</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mf">1</span><span class="p">],</span> <span class="s">&#39;wt&#39;</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>

    <span class="n">fieldnames</span> <span class="o">=</span> <span class="p">(</span><span class="s">&#39;Title 1&#39;</span><span class="p">,</span> <span class="s">&#39;Title 2&#39;</span><span class="p">,</span> <span class="s">&#39;Title 3&#39;</span><span class="p">)</span>
    <span class="n">writer</span> <span class="o">=</span> <span class="n">csv</span><span class="o">.</span><span class="n">DictWriter</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">fieldnames</span><span class="o">=</span><span class="n">fieldnames</span><span class="p">)</span>
    <span class="n">headers</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="n">fieldnames</span><span class="p">:</span>
        <span class="n">headers</span><span class="p">[</span><span class="n">n</span><span class="p">]</span> <span class="o">=</span> <span class="n">n</span>
    <span class="n">writer</span><span class="o">.</span><span class="n">writerow</span><span class="p">(</span><span class="n">headers</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mf">10</span><span class="p">):</span>
        <span class="n">writer</span><span class="o">.</span><span class="n">writerow</span><span class="p">({</span> <span class="s">&#39;Title 1&#39;</span><span class="p">:</span><span class="n">i</span><span class="o">+</span><span class="mf">1</span><span class="p">,</span>
                        <span class="s">&#39;Title 2&#39;</span><span class="p">:</span><span class="nb">chr</span><span class="p">(</span><span class="nb">ord</span><span class="p">(</span><span class="s">&#39;a&#39;</span><span class="p">)</span> <span class="o">+</span> <span class="n">i</span><span class="p">),</span>
                        <span class="s">&#39;Title 3&#39;</span><span class="p">:</span><span class="s">&#39;08/</span><span class="si">%02d</span><span class="s">/07&#39;</span> <span class="o">%</span> <span class="p">(</span><span class="n">i</span><span class="o">+</span><span class="mf">1</span><span class="p">),</span>
                        <span class="p">})</span>

<span class="k">finally</span><span class="p">:</span>

    <span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<div class="highlight-python"><pre>$ python csv_dictwriter.py testout.csv
$ cat testout.csv
Title 1,Title 2,Title 3
1,a,08/01/07
2,b,08/02/07
3,c,08/03/07
4,d,08/04/07
5,e,08/05/07
6,f,08/06/07
7,g,08/07/07
8,h,08/08/07
9,i,08/09/07
10,j,08/10/07</pre>
</div>
</div>
<div class="section" id="id7">
<h2>参考<a class="headerlink" href="#id7" title="Permalink to this headline">¶</a></h2>
<ul class="simple">
<li><a class="reference external" href="http://www.doughellmann.com/projects/PyMOTW/">Python Module of the Week Home</a></li>
<li><a class="reference external" href="http://www.doughellmann.com/downloads/PyMOTW-1.14.tar.gz">Download Sample Code</a></li>
<li><a class="reference external" href="http://www.python.org/peps/pep-0305.html">PEP 305, CSV File API</a></li>
</ul>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="calendar.html" title="PyMOTW: calendar"
             >next</a> |</li>
        <li class="right" >
          <a href="sched.html" title="PyMOTW: sched"
             >previous</a> |</li>
        <li><a href="../index.html">PyMOTW Document v1.6 documentation</a> &raquo;</li> 
      </ul>
    </div>
    <div class="footer">
      &copy; <a href="../copyright.html">Copyright</a> 2009, vbarter &amp; liz.
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 0.6.3.
    </div>
  </body>
</html>